

Report on foundation model impacts released

AIHub

Partnership on AI has published a progress report on post-deployment governance practices pertaining to foundation models. The document, entitled "2026 Transparency Report on Foundation Model Impacts", measures the progress of 13 foundation model providers in publicly documenting the impacts of their foundation models. In carrying out their analysis, authors Jacob Pratt and Albert Tanjaya reviewed more than 150 papers, articles, websites, and reports. For assessment, the four information-sharing practices examined in the report were broken down into 19 processes, or activities, that support how foundation model providers adopt those practices. Although several leading organizations are defining what information to share and how, the rest have been slow to adopt information-sharing practices.


AI for Science – from cosmology to chemistry

AIHub

On 31st March, our editorial team headed to the Royal Society for AI for Science. This day-long conference explored how AI is changing the nature of scientific discovery, and was hosted by the Fundamental Research team at the Alan Turing Institute. Nestled in a terrace of 19th-century townhouses along the banks of the Thames, the Royal Society looks as grand as the names who have passed through its doors over the years. Prof Jason McEwen, Chief Scientist at the Turing Institute, opened the event with an insightful talk on the nature of scientific revolution, and on how the bidirectional relationship between AI and science could spark the next one. Then, Prof Anna Scaife from the University of Manchester spoke on the use of foundation models for astronomical discovery.


Non-Stationarity in the Embedding Space of Time Series Foundation Models

Choi, Jinmyeong, Shook, Brad, Dubrawski, Artur

arXiv.org Machine Learning

Time series foundation models (TSFMs) are widely used as generic feature extractors, yet the notion of non-stationarity in their embedding spaces remains poorly understood. Recent work often conflates non-stationarity with distribution shift, blurring distinctions fundamental to classical time-series analysis and long-standing methodologies such as statistical process control (SPC). In SPC, non-stationarity signals a process leaving a stable regime - via shifts in mean, variance, or emerging trends - and detecting such departures is central to quality monitoring and change-point analysis. Motivated by this diagnostic tradition, we study how different forms of distributional non-stationarity - mean shifts, variance changes, and linear trends - become linearly accessible in TSFM embedding spaces under controlled conditions. We further examine temporal non-stationarity arising from persistence, which reflects violations of weak stationarity due to long-memory or near-unit-root behavior rather than explicit distributional shifts. By sweeping shift strength and probing multiple TSFMs, we find that embedding-space detectability of non-stationarity degrades smoothly and that different models exhibit distinct, model-specific failure modes.
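The SPC diagnostic tradition the abstract invokes can be illustrated with a minimal one-sided CUSUM change-point detector on a synthetic mean shift. This is a generic sketch of the classical technique, not the authors' experimental code, and it operates on raw values rather than TSFM embeddings:

```python
def cusum_detect(series, target_mean, k=0.5, h=5.0):
    """One-sided CUSUM: flag the first index at which the accumulated
    positive deviation from the target mean exceeds threshold h.
    k is the allowance (slack) per step; h is the decision interval."""
    s = 0.0
    for i, x in enumerate(series):
        s = max(0.0, s + (x - target_mean - k))
        if s > h:
            return i  # change-point alarm
    return None

# Stable regime at mean 0, then a mean shift to 2 at index 100.
series = [0.0] * 100 + [2.0] * 100
alarm = cusum_detect(series, target_mean=0.0)
print(alarm)  # 103: the alarm fires a few steps after the shift
```

The few-step lag is inherent to CUSUM: evidence of a shift accumulates at rate (shift size minus allowance) per sample before crossing the decision interval.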


Noise Immunity in In-Context Tabular Learning: An Empirical Robustness Analysis of TabPFN's Attention Mechanisms

Hu, James, Ghelichi, Mahdi

arXiv.org Machine Learning

Tabular foundation models (TFMs) such as TabPFN (Tabular Prior-Data Fitted Network) are designed to generalize across heterogeneous tabular datasets through in-context learning (ICL). They perform prediction in a single forward pass conditioned on labeled examples without dataset-specific parameter updates. This paradigm is particularly attractive in industrial domains (e.g., finance and healthcare) where tabular prediction is pervasive. Retraining a bespoke model for each new table can be costly or infeasible in these settings, while data quality issues such as irrelevant predictors, correlated feature groups, and label noise are common. In this paper, we provide strong empirical evidence that TabPFN is highly robust under these sub-optimal conditions. We study TabPFN and its attention mechanisms for binary classification problems with controlled synthetic perturbations that vary: (i) dataset width by injecting random uncorrelated features and by introducing nonlinearly correlated features, (ii) dataset size by increasing the number of training rows, and (iii) label quality by increasing the fraction of mislabeled targets. Beyond predictive performance, we analyze internal signals including attention concentration and attention-based feature ranking metrics. Across these parametric tests, TabPFN is remarkably resilient: ROC-AUC remains high, attention stays structured and sharp, and informative features are highly ranked by attention-based metrics. Qualitative visualizations with attention heatmaps, feature-token embeddings, and SHAP plots further support a consistent pattern across layers in which TabPFN increasingly concentrates on useful features while separating their signals from noise. Together, these findings suggest that TabPFN is a robust TFM capable of maintaining both predictive performance and coherent internal behavior under various scenarios of data imperfections.
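The controlled perturbations described above can be sketched generically: widening the table with uncorrelated and nonlinearly correlated columns, and flipping a fraction of binary labels. This is an illustrative reconstruction of the perturbation protocol, not the paper's code; the perturbed table would then be passed to TabPFN through its usual fit/predict interface:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise_features(X, n_noise):
    """Widen the table with uncorrelated standard-normal columns."""
    noise = rng.standard_normal((X.shape[0], n_noise))
    return np.hstack([X, noise])

def add_correlated_features(X, cols):
    """Append nonlinear transforms of existing columns (redundant signal)."""
    extra = np.tanh(X[:, cols]) + 0.1 * rng.standard_normal((X.shape[0], len(cols)))
    return np.hstack([X, extra])

def flip_labels(y, frac):
    """Mislabel a random fraction of binary targets."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
    y[idx] = 1 - y[idx]
    return y

X = rng.standard_normal((200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_wide = add_correlated_features(add_noise_features(X, 20), cols=[0, 1])
y_noisy = flip_labels(y, frac=0.1)
print(X_wide.shape)  # (200, 27): 5 informative + 20 noise + 2 correlated
```

Sweeping `n_noise` and `frac` while tracking ROC-AUC and attention statistics is the kind of parametric test the paper reports.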


Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression

Izbicki, Rafael, Rodrigues, Pedro L. C.

arXiv.org Machine Learning

Conditional density estimation (CDE) - recovering the full conditional distribution of a response given tabular covariates - is essential in settings with heteroscedasticity, multimodality, or asymmetric uncertainty. Recent tabular foundation models, such as TabPFN and TabICL, naturally produce predictive distributions, but their effectiveness as general-purpose CDE methods has not been systematically evaluated, unlike their performance for point prediction, which is well studied. We benchmark three tabular foundation model variants against a diverse set of parametric, tree-based, and neural CDE baselines on 39 real-world datasets, across training sizes from 50 to 20,000, using six metrics covering density accuracy, calibration, and computation time. Across all sample sizes, foundation models achieve the best CDE loss, log-likelihood, and CRPS on the large majority of datasets tested. Calibration is competitive at small sample sizes but, for some metrics and datasets, lags behind task-specific neural baselines at larger sample sizes, suggesting that post-hoc recalibration may be a valuable complement. In a photometric redshift case study using SDSS DR18, TabPFN exposed to 50,000 training galaxies outperforms all baselines trained on the full 500,000-galaxy dataset. Taken together, these results establish tabular foundation models as strong off-the-shelf conditional density estimators.
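One of the metrics named above, CRPS, has a standard empirical form for an ensemble of predictive samples. The following is a textbook sketch of that formula, not the benchmark's implementation:

```python
def crps_ensemble(samples, y):
    """Empirical CRPS for predictive samples X drawn from forecast F:
    CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|  (lower is better)."""
    n = len(samples)
    term1 = sum(abs(x - y) for x in samples) / n
    term2 = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term1 - 0.5 * term2

# A sharp, well-centred ensemble scores far lower than a biased one.
print(crps_ensemble([0.9, 1.0, 1.1], y=1.0))  # ~0.0222
print(crps_ensemble([2.9, 3.0, 3.1], y=1.0))  # ~1.9556
```

Because CRPS rewards both calibration and sharpness, it complements the pure density-accuracy metrics (CDE loss, log-likelihood) used in the benchmark.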


Probabilistic Geometric Alignment via Bayesian Latent Transport for Domain-Adaptive Foundation Models

Aueawatthanaphisut, Aueaphum, Auewattanapisut, Kuepon

arXiv.org Machine Learning

Adapting large-scale foundation models to new domains with limited supervision remains a fundamental challenge due to latent distribution mismatch, unstable optimization dynamics, and miscalibrated uncertainty propagation. This paper introduces an uncertainty-aware probabilistic latent transport framework that formulates domain adaptation as a stochastic geometric alignment problem in representation space. A Bayesian transport operator is proposed to redistribute latent probability mass along Wasserstein-type geodesic trajectories, while a PAC-Bayesian regularization mechanism constrains posterior model complexity to mitigate catastrophic overfitting. The proposed formulation yields theoretical guarantees on convergence stability, loss landscape smoothness, and sample efficiency under distributional shift. Empirical analyses demonstrate substantial reduction in latent manifold discrepancy, accelerated transport energy decay, and improved covariance calibration compared with deterministic fine-tuning and adversarial domain adaptation baselines. Furthermore, bounded posterior uncertainty evolution indicates enhanced probabilistic reliability during cross-domain transfer. By establishing a principled connection between stochastic optimal transport geometry and statistical generalization theory, the proposed framework provides new insights into robust adaptation of modern foundation architectures operating in heterogeneous environments. These findings suggest that uncertainty-aware probabilistic alignment constitutes a promising paradigm for reliable transfer learning in next-generation deep representation systems.
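The Wasserstein-type geodesic trajectories mentioned above have a well-known closed form in the simplest case: between two 1D Gaussians, the 2-Wasserstein geodesic interpolates the mean and standard deviation linearly. This toy sketch illustrates that standard fact only; it is not the paper's Bayesian transport operator:

```python
def w2_geodesic_gauss1d(mu0, s0, mu1, s1, t):
    """Point at time t in [0, 1] on the 2-Wasserstein geodesic between
    N(mu0, s0^2) and N(mu1, s1^2): both parameters interpolate linearly."""
    return ((1 - t) * mu0 + t * mu1, (1 - t) * s0 + t * s1)

# Halfway between N(0, 1) and N(4, 9):
print(w2_geodesic_gauss1d(0.0, 1.0, 4.0, 3.0, 0.5))  # (2.0, 2.0)
```

In higher dimensions the covariance term involves matrix square roots, and the paper's contribution is to make such transport Bayesian and regularized rather than deterministic.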


Interview with AAAI Fellow Yan Liu: machine learning for time series

AIHub

Each year the AAAI recognizes a group of individuals who have made significant, sustained contributions to the field of artificial intelligence by appointing them as Fellows. Over the course of the next few months, we'll be talking to some of the 2026 AAAI Fellows. In this interview, we met with Yan Liu, University of Southern California, who was elected as a Fellow. We found out about how time series research has progressed, the vast range of applications, and what the future holds for this field. Could you start with a quick introduction to your area of research?


Towards Federated Foundation Models: Scalable Dataset Pipelines for Group-Structured Learning

Charles, Zachary

Neural Information Processing Systems

We introduce Dataset Grouper, a library to create large-scale group-structured (e.g., federated) datasets, enabling federated learning simulation at the scale of foundation models. This library facilitates the creation of group-structured versions of existing datasets based on user-specified partitions, and directly leads to a variety of useful heterogeneous datasets that can be plugged into existing software frameworks. Dataset Grouper offers three key advantages. First, it scales to settings where even a single group's dataset is too large to fit in memory. Second, it provides flexibility, both in choosing the base (non-partitioned) dataset and in defining partitions.
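The core idea of a user-specified partition can be shown with a generic grouping function. This is an illustration of the concept only, not Dataset Grouper's actual API; the real library additionally streams each group so that even a single group's dataset need not fit in memory, unlike this in-memory dict:

```python
from collections import defaultdict

def partition_by_key(examples, key_fn):
    """Group a flat stream of examples into per-group datasets using a
    user-specified partition function (hypothetical illustration)."""
    groups = defaultdict(list)
    for ex in examples:
        groups[key_fn(ex)].append(ex)
    return dict(groups)

examples = [
    {"client": "a", "text": "hello"},
    {"client": "b", "text": "world"},
    {"client": "a", "text": "again"},
]
groups = partition_by_key(examples, key_fn=lambda ex: ex["client"])
print(sorted(groups))    # ['a', 'b']
print(len(groups["a"]))  # 2
```

Each resulting group plays the role of one federated client's local dataset in a simulation.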


Segment Anything in 3D with NeRFs

Neural Information Processing Systems

We refer to the proposed solution as SA3D, for Segment Anything in 3D. Only a manual segmentation prompt (e.g., rough points) for the target object in a single view is required; this is used to generate the object's 2D mask in that view with SAM.


GV-Rep: A Large-Scale Dataset for Genetic Variant Representation Learning

Neural Information Processing Systems

The development of deep learning approaches for modeling these multifactorial effects of GVs is still in its nascent stages, primarily due to the lack of comprehensive datasets that capture the intricate relationships between GVs and their downstream effects on complex traits.